AITopics | attention memory

Visual Reference Resolution using Attention Memory for Visual Dialog

Neural Information Processing SystemsNov-21-2025, 15:27:29 GMT

Visual dialog is a task of answering a series of inter-dependent questions given an input image, and often requires to resolve visual references among the questions. This problem is different from visual question answering (VQA), which relies on spatial attention ({\em a.k.a.

attention memory, name change, visual reference resolution, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.56)

Add feedback

Visual Reference Resolution using Attention Memory for Visual Dialog

Paul Hongsuck Seo, Andreas Lehrmann, Bohyung Han, Leonid Sigal

Neural Information Processing SystemsNov-21-2025, 10:12:09 GMT

Neural Information Processing Systems http://nips.cc/

history, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Visual Reference Resolution using Attention Memory for Visual Dialog

Paul Hongsuck Seo, Andreas Lehrmann, Bohyung Han, Leonid Sigal

Neural Information Processing SystemsOct-3-2024, 13:14:17 GMT

Neural Information Processing Systems http://nips.cc/

dialog, history, reference resolution, (13 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Neural Attention Memory

Nam, Hyoungwook, Seo, Seung Byum

arXiv.org Artificial IntelligenceOct-14-2023

We propose a novel perspective of the attention mechanism by reinventing it as a memory architecture for neural networks, namely Neural Attention Memory (NAM). NAM is a memory structure that is both readable and writable via differentiable linear algebra operations. We explore three use cases of NAM: memory-augmented neural network (MANN), few-shot learning, and efficient long-range attention. First, we design two NAM-based MANNs of Long Short-term Memory (LSAM) and NAM Turing Machine (NAM-TM) that show better computational powers in algorithmic zero-shot generalization tasks compared to other baselines such as differentiable neural computer (DNC). Next, we apply NAM to the N-way K-shot learning task and show that it is more effective at reducing false positives compared to the baseline cosine classifier. Finally, we implement an efficient Transformer with NAM and evaluate it with long-range arena tasks to show that NAM can be an efficient and effective alternative for scaled dot-product attention.

attention mechanism, mechanism, transformer, (14 more...)

arXiv.org Artificial Intelligence

2302.09422

Country: North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Visual Reference Resolution using Attention Memory for Visual Dialog

Seo, Paul Hongsuck, Lehrmann, Andreas, Han, Bohyung, Sigal, Leonid

Neural Information Processing SystemsFeb-14-2020, 13:43:25 GMT

Visual dialog is a task of answering a series of inter-dependent questions given an input image, and often requires to resolve visual references among the questions. This problem is different from visual question answering (VQA), which relies on spatial attention ({\em a.k.a. We propose a novel attention mechanism that exploits visual attentions in the past to resolve the current reference in the visual dialog scenario. The proposed model is equipped with an associative attention memory storing a sequence of previous (attention, key) pairs. From this memory, the model retrieves previous attention, taking into account recency, that is most relevant for the current question, in order to resolve potentially ambiguous reference(s).

attention memory, visual dialog, visual reference resolution, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.64)

Add feedback

Visual Reference Resolution using Attention Memory for Visual Dialog

Seo, Paul Hongsuck, Lehrmann, Andreas, Han, Bohyung, Sigal, Leonid

Neural Information Processing SystemsDec-31-2017

Visual dialog is a task of answering a series of inter-dependent questions given an input image, and often requires to resolve visual references among the questions. This problem is different from visual question answering (VQA), which relies on spatial attention ({\em a.k.a. visual grounding}) estimated from an image and question pair. We propose a novel attention mechanism that exploits visual attentions in the past to resolve the current reference in the visual dialog scenario. The proposed model is equipped with an associative attention memory storing a sequence of previous (attention, key) pairs. From this memory, the model retrieves previous attention, taking into account recency, that is most relevant for the current question, in order to resolve potentially ambiguous reference(s). The model then merges the retrieved attention with the tentative one to obtain the final attention for the current question; specifically, we use dynamic parameter prediction to combine the two attentions conditioned on the question. Through extensive experiments on a new synthetic visual dialog dataset, we show that our model significantly outperforms the state-of-the-art (by ~16 % points) in the situation where the visual reference resolution plays an important role. Moreover, the proposed model presents superior performance (~2 % points improvement) in the Visual Dialog dataset, despite having significantly fewer parameters than the baselines.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Technology: